NYC Accidents Data Exploration

GitHub strips the JavaScript from rendered notebooks, so the map will not display when this notebook is viewed on GitHub directly. To see the visualization, open the notebook in nbviewer: https://nbviewer.jupyter.org/github/JasonSanchez/dangerous-intersections/blob/master/accidents/NYC%20Accident%20Data%20Exploration.ipynb


In [2]:
# Import dependencies
import folium
import numpy as np
import pandas as pd

# Load accident data.
accident_data = pd.read_csv('./data/NYPD_Motor_Vehicle_Collisions_sampled.csv')

In [23]:
accident_data.head()


Out[23]:
DATE TIME BOROUGH ZIP CODE LATITUDE LONGITUDE LOCATION ON STREET NAME CROSS STREET NAME OFF STREET NAME ... CONTRIBUTING FACTOR VEHICLE 2 CONTRIBUTING FACTOR VEHICLE 3 CONTRIBUTING FACTOR VEHICLE 4 CONTRIBUTING FACTOR VEHICLE 5 UNIQUE KEY VEHICLE TYPE CODE 1 VEHICLE TYPE CODE 2 VEHICLE TYPE CODE 3 VEHICLE TYPE CODE 4 VEHICLE TYPE CODE 5
0 06/18/2016 5:20 BRONX 10456 40.824067 -73.908710 (40.8240665, -73.9087095) EAST 163 STREET 3 AVENUE NaN ... Unspecified NaN NaN NaN 3463614 PASSENGER VEHICLE NaN NaN NaN NaN
1 06/18/2016 7:10 BRONX 10472 40.826916 -73.872030 (40.8269163, -73.8720302) METCALF AVENUE WATSON AVENUE NaN ... Unspecified NaN NaN NaN 3464214 PASSENGER VEHICLE PASSENGER VEHICLE NaN NaN NaN
2 06/18/2016 7:20 NaN NaN 40.701455 -73.989620 (40.7014547, -73.9896203) NaN NaN NaN ... Unspecified NaN NaN NaN 3463782 PASSENGER VEHICLE PASSENGER VEHICLE NaN NaN NaN
3 06/18/2016 7:30 NaN NaN NaN NaN NaN 47 STREET NaN NaN ... Unspecified NaN NaN NaN 3465413 PASSENGER VEHICLE OTHER NaN NaN NaN
4 06/18/2016 7:45 QUEENS 11422 40.665256 -73.735334 (40.665256, -73.7353338) SOUTH CONDUIT AVENUE FRANCIS LEWIS BOULEVARD NaN ... Unspecified NaN NaN NaN 3463318 PASSENGER VEHICLE PASSENGER VEHICLE NaN NaN NaN

5 rows × 29 columns


In [95]:
accident_data.describe()


Out[95]:
ZIP CODE LATITUDE LONGITUDE NUMBER OF PERSONS INJURED NUMBER OF PERSONS KILLED NUMBER OF PEDESTRIANS INJURED NUMBER OF PEDESTRIANS KILLED NUMBER OF CYCLIST INJURED NUMBER OF CYCLIST KILLED NUMBER OF MOTORIST INJURED NUMBER OF MOTORIST KILLED UNIQUE KEY
count 720386.000000 771401.000000 771401.000000 975764.000000 975764.000000 975764.000000 975764.000000 975764.000000 975764.000000 975764.000000 975764.000000 975764.000000
mean 10808.078445 40.722982 -73.923256 0.255354 0.001224 0.053982 0.000691 0.020909 0.000076 0.191262 0.000460 2029945.753199
std 566.952546 0.077370 0.086025 0.656131 0.036760 0.246789 0.026545 0.151044 0.008708 0.663912 0.024059 1515381.600845
min 10000.000000 40.498949 -74.254532 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 22.000000
25% 10075.000000 40.669100 -73.979237 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 245746.750000
50% 11204.000000 40.723494 -73.933938 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3122593.500000
75% 11236.000000 40.765579 -73.869941 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3367943.250000
max 11697.000000 40.912869 -73.700597 43.000000 5.000000 15.000000 2.000000 6.000000 1.000000 43.000000 5.000000 3612908.000000

In [15]:
# Count non-null values in each column.
print(accident_data.count())


DATE                             10000
TIME                             10000
BOROUGH                           6396
ZIP CODE                          6395
LATITUDE                          6687
LONGITUDE                         6687
LOCATION                          6687
ON STREET NAME                    6610
CROSS STREET NAME                 5197
OFF STREET NAME                   2174
NUMBER OF PERSONS INJURED        10000
NUMBER OF PERSONS KILLED         10000
NUMBER OF PEDESTRIANS INJURED    10000
NUMBER OF PEDESTRIANS KILLED     10000
NUMBER OF CYCLIST INJURED        10000
NUMBER OF CYCLIST KILLED         10000
NUMBER OF MOTORIST INJURED       10000
NUMBER OF MOTORIST KILLED        10000
CONTRIBUTING FACTOR VEHICLE 1     9950
CONTRIBUTING FACTOR VEHICLE 2     8320
CONTRIBUTING FACTOR VEHICLE 3      700
CONTRIBUTING FACTOR VEHICLE 4      196
CONTRIBUTING FACTOR VEHICLE 5       40
UNIQUE KEY                       10000
VEHICLE TYPE CODE 1               9779
VEHICLE TYPE CODE 2               7002
VEHICLE TYPE CODE 3                623
VEHICLE TYPE CODE 4                173
VEHICLE TYPE CODE 5                 34
dtype: int64
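
The counts above show how much of the sample lacks location fields. As a rough sketch, the snippet below recomputes the missing share for the location columns from the printed counts. The dictionary `non_null_counts` and the helper `missing_share` are illustrative names, not part of the notebook. On the loaded frame, `accident_data.isna().mean()` should give the same fractions directly.

```python
# Non-null counts copied from the output above (sample of 10,000 rows).
TOTAL_ROWS = 10000
non_null_counts = {
    "BOROUGH": 6396,
    "ZIP CODE": 6395,
    "LATITUDE": 6687,
    "LONGITUDE": 6687,
    "ON STREET NAME": 6610,
    "CROSS STREET NAME": 5197,
    "OFF STREET NAME": 2174,
}


def missing_share(non_null, total=TOTAL_ROWS):
    """Return the fraction of rows in which a column has no value."""
    return (total - non_null) / total


missing = {column: missing_share(count)
           for column, count in non_null_counts.items()}

# Print columns from most to least incomplete.
for column, share in sorted(missing.items(), key=lambda item: item[1], reverse=True):
    print(f"{column:<20} {share:.1%}")
```

About a third of the sampled records have no latitude or longitude, which is why the map cell below only plots rows where both coordinates are present.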

Map data


In [3]:
# Map data.

# Starting coordinates to load map view.
NYC_coordinates = (40.7142700, -74.0059700)

# Create Map object.
map = folium.Map(location=NYC_coordinates,
                     zoom_start=12)

# Plot accidents.
# Limit number of points to plot for testing.
MAX_RECORDS = 1000
# Recent folium releases expose MarkerCluster under folium.plugins and
# IFrame as folium.IFrame; older releases used folium.MarkerCluster and
# folium.element.IFrame.
from folium.plugins import MarkerCluster
marker_cluster = MarkerCluster().add_to(map)
for _, row in accident_data.head(MAX_RECORDS).iterrows():
    # Only plot point if lat/long is available.
    if pd.notnull(row['LATITUDE']) and pd.notnull(row['LONGITUDE']):
        accident_metadata = """
                <ul>
                    <li><strong>On street</strong>: {0}</li>
                    <li><strong>Cross street</strong>: {1}</li>
                    <li><strong>Reason</strong>: {2}</li>
                </ul>""".format(
            str(row['ON STREET NAME']), str(row['CROSS STREET NAME']),
            str(row['CONTRIBUTING FACTOR VEHICLE 1']))
        iframe = folium.IFrame(html=accident_metadata, width=250, height=100)
        popup = folium.Popup(iframe, max_width=2650)
        folium.Marker(
            location=[row['LATITUDE'], row['LONGITUDE']],
            icon=folium.Icon(color='red', icon='asterisk'),
            popup=popup).add_to(marker_cluster)

map


Out[3]:

In [4]:
# Save html version of map.
map.save('accidents_map.html')
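
In the map cell, missing street names are passed through `str()`, so a popup can show the literal text `nan`. One possible way to tidy this up is sketched below. The helpers `clean` and `format_popup` and the `"Unknown"` fallback are hypothetical names chosen for illustration, and the sketch uses only the standard library so it can be tried without the dataset.

```python
import html
import math


def clean(value, fallback="Unknown"):
    """Return an HTML-safe string, or `fallback` when the value is missing."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return fallback
    return html.escape(str(value))


def format_popup(on_street, cross_street, reason):
    """Build the popup list markup for one accident record."""
    return (
        "<ul>"
        f"<li><strong>On street</strong>: {clean(on_street)}</li>"
        f"<li><strong>Cross street</strong>: {clean(cross_street)}</li>"
        f"<li><strong>Reason</strong>: {clean(reason)}</li>"
        "</ul>"
    )


# Example with a missing cross street, as happens in several sampled rows.
popup_html = format_popup("EAST 163 STREET", float("nan"), "Unspecified")
print(popup_html)
```

The returned string could be passed as the `html` argument of the IFrame in place of `accident_metadata`.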

In [ ]: